Differentially Private Reinforcement Learning with Self-Play
We study multi-agent reinforcement learning (multi-agent RL) under differential privacy (DP) constraints. The problem is motivated by real-world applications that involve sensitive data, where protecting users' private information is critical. We first extend the definitions of Joint DP (JDP) and Local DP (LDP) to two-player zero-sum episodic Markov games, where both definitions ensure trajectory-wise privacy protection. We then design a provably efficient algorithm based on optimistic Nash value iteration and privatization of Bernstein-type bonuses. When instantiated with appropriate privacy mechanisms, the algorithm satisfies the JDP and LDP requirements.
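To make "privacy mechanism" concrete, here is a minimal sketch of one standard building block, the Laplace mechanism applied to visit counts. It is not the paper's construction: the function name `privatize_visit_counts`, the (H, S, A) count layout, the one-shot release, and the replace-one-trajectory neighbouring relation are all assumptions made for illustration.

```python
import numpy as np


def privatize_visit_counts(counts, epsilon, rng=None):
    """Release noisy visit counts with the Laplace mechanism (illustrative).

    counts  : array of shape (H, S, A); counts[h, s, a] is how many
              trajectories visited (s, a) at step h (hypothetical layout).
    epsilon : privacy budget for this single release.
    """
    counts = np.asarray(counts, dtype=float)
    if counts.ndim != 3:
        raise ValueError("counts must have shape (H, S, A)")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng

    horizon = counts.shape[0]
    # Assumption: neighbouring datasets differ by replacing one trajectory.
    # A trajectory touches one (s, a) cell per step, so a replacement moves
    # at most two cells per step by 1 each: L1 sensitivity <= 2 * H.
    scale = 2.0 * horizon / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=counts.shape)
    return counts + noise


# Usage: 5 steps, 4 states, 3 actions, epsilon = 1.0
raw = np.zeros((5, 4, 3))
noisy = privatize_visit_counts(raw, epsilon=1.0, rng=np.random.default_rng(0))
```

The noisy counts can be negative or fractional, so anything built on them (empirical transition estimates, bonus terms) would need to account for that. A learner that releases counts every episode would also need a mechanism designed for repeated release rather than this one-shot version.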
differentially private reinforcement learning, multi-agent RL, trajectory-wise privacy protection